Request


This report includes a description and the results of the analyses requested by Elisa Jentho at 01/06/2021:

  1. Plot marker genes upon integrated Cre3 and Lox2 Plasmodium chabaudi scRNA-seq samples with Seurat:

    • Markers_for_UMAP.xlsx: plot the list of markers in the overlapped UMAP of the integrated Cre3 and Lox2 samples.

    • More_markers.xlsx: plot the list of markers separately across the two UMAPs of the integrated Cre3 and Lox2 samples.






About this report


The R version used was 3.6.3 (R Core Team 2020). This report was built with the rmarkdown R package (v.2.5) (Allaire et al. 2020; Xie, Allaire, and Grolemund 2018; Xie, Dervieux, and Riederer 2020). See the full list of packages and versions used at the end of this report - R packages used and respective versions.

The integrated 10x Lox2 and Cre3 samples processed previously were imported to plot the markers upon the UMAPs with the Seurat R package (v.4.0.0) (Satija et al. 2015; Butler et al. 2018; Stuart et al. 2019; Hao et al. 2020).

The figures and tables displayed through the report can be download in pdf and tab-separated format by clicking on the bottom left Download plot and Download table button that appears after each plot and table, respectively.

## Set seed 
set.seed(seed = 1024) # to keep reproducibility
## Import packages 

library("dplyr", quietly = TRUE)
library("DT", quietly = TRUE) # package to print nice in the html report 
library("Seurat", quietly = TRUE)
library("gplots", quietly = TRUE)
library("Matrix", quietly = TRUE)
library("ggplot2", quietly = TRUE)
library("tidyr", quietly = TRUE)
library("readxl", quietly = TRUE)






Data analysis


Import data: Cre3 & Lox2 integrated samples


## Import Seurat object with the re3 & Lox2 integrated samples

# import obj integrated
seu <- readRDS(file = "../results/int/R_objects/seu.rds")



Import data: Marker lists


## Import marker lists 

# marker lists to import 
markers2import <- list.files(path = "../data/markers_to_plot", full.names = TRUE)
names(markers2import) <- c("int", "sep") 
# "int": integrated/overlapped UMAP marker list | "sep": separated UMAP marker list

# import lists
markers <- list(
  "int" = read_excel(path = markers2import[1], col_names = TRUE), 
  "sep" = read_excel(path = markers2import[2], col_names = FALSE)
)

# parse lists
markers$int = markers$int %>% select_if(is.character)
colnames(markers$sep) <- "Genes"

The two lists of markers can be found below.








Plot markers by cluster


To plot markers it was used the integrated Seurat object with the Cre3 and Lox2 samples. The assay used was RNA and the slot data with the counts normalized with the LogNormalize method (read more).

The individual plots of the cluster markers highlighted through the integrated UMAPs can be found at: results/markers/int/ (with each cluster folder holding their respective markers).

## Plot list of markers by cluster

# folder to save results
plot_markers <- "../results/markers"
if ( ! dir.exists(plot_markers) ) dir.create(plot_markers)

# list to save plots
marker_plots[["sep"]] <- marker_plots[["int"]] <- marker_plots <- list()

# check assay 
stopifnot(DefaultAssay(seu) == "RNA")

for ( clt in colnames(markers$int) ) { # loop over cols/cluster markers
  clt_name <- gsub(pattern = " ", replacement = "-", x = clt)
  
  # create new folder for the cluster
  plot_clt_folder <- paste(plot_markers, paste0("int/", clt_name), sep = "/")
  if ( ! dir.exists(plot_clt_folder) ) dir.create(plot_clt_folder, recursive = TRUE)
  
  marker_genes <- markers$int[,clt,drop=TRUE] # retrieve the marker genes for each col/cluster
  
  # save to list
  marker_plots$int[[clt_name]] <- list()
  
  for ( mkr in marker_genes ) { # loop over marker genes within a cluster
    if( ! is.na(mkr) ) { # plot them only if are not 'NA'
      stopifnot( mkr %in% row.names(seu) ) # stop if gene does not exist in the Seurat obj
      marker_plots$int[[clt_name]][[mkr]] <- FeaturePlot(object = seu, features = mkr, 
                                                         slot = "data", reduction = "umap", 
                                                         pt.size = 0.001)
      ggsave(filename = paste(plot_clt_folder, paste0(clt_name, "_", mkr, "_gene_UMAP.pdf"), sep = "/"), 
             plot = marker_plots$int[[clt_name]][[mkr]], height = 5, width = 5)
    }
  }
}



  • Cluster 0


cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-0`)



  • Cluster 1


cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-1`)



  • Cluster 2


cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-2`)



  • Cluster 3


cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-3`)



  • Cluster 4


cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-4`)



  • Cluster 5


cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-5`)



  • Cluster 6


cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-6`)



  • Cluster 7


cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-7`)



  • Cluster 8


cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-8`)






Plot other markers split by sample


To plot markers it was used the integrated Seurat object with the Cre3 and Lox2 samples. The assay used was RNA and the slot data with the counts normalized with the LogNormalize method (read more).

The individual plots of the markers split by sample highlighted through the integrated UMAPs can be found at: results/markers/sep/.

WARNING: The genes PCHAS-051500 and PCHAS-141240 provided as markers to plot were not found across the scRNA-seq data sets, and, thus they were not plotted.

## Plot list of markers by sample

# folder to save results
plot_sep_markers <- "../results/markers/sep"
if ( ! dir.exists(plot_sep_markers) ) dir.create(plot_sep_markers)

# Seurat: split objects by sample
sobjList <- SplitObject(seu, split.by = "orig.ident") 

# check assay 
stopifnot(DefaultAssay(seu) == "RNA")
stopifnot( c(DefaultAssay(sobjList$Cre3), DefaultAssay(sobjList$Lox2)) == "RNA" )
for ( mkr in markers$sep$Genes ) { # loop over marker genes within a cluster
  #stopifnot( mkr %in% row.names(seu) ) # stop if gene does not exist in the Seurat obj
  if ( mkr %in% row.names(seu) ) {
    marker_plots$sep[[mkr]] <- list()
    for ( sobj in rev(names(sobjList)) ) {
            marker_plots$sep[[mkr]][[sobj]] <- FeaturePlot(object = sobjList[[sobj]], features = mkr,
                                                           slot = "data", reduction = "umap", 
                                                           pt.size = 0.001)
            ggsave(filename = paste(plot_sep_markers, paste0(sobj, "_", mkr, "_gene_UMAP_split_by_sample.pdf"), sep = "/"), 
                   plot = marker_plots$sep[[mkr]][[sobj]], height = 5, width = 5)
    }
  }
}



  • Marker PCHAS-021240


cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-021240`)



  • Marker PCHAS-121330


cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-121330`)



  • Marker PCHAS-100390


cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-100390`)



  • Marker PCHAS-140270


cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-140270`)



  • Marker PCHAS-143360


cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-143360`)



  • Marker PCHAS-110810


cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-110810`)



  • Marker PCHAS-040720


cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-040720`)



  • Marker PCHAS-031680


cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-031680`)



  • Marker PCHAS-122070


cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-122070`)



  • Marker PCHAS-142660


cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-142660`)



  • Marker PCHAS-146340


cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-146340`)



  • Marker PCHAS-072660


cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-072660`)



  • Marker PCHAS-081890


cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-081890`)



  • Marker PCHAS-113160


cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-113160`)






Deliver

Folders inside the project folder:

  • results: folder that contains all the results obtained in this analysis;

  • report: folder that contains the report and code used herein;

  • data: folder that contains the data used herein;

  • scripts: folder that contains the scripts used herein;

  • info: folder that contains some useful information (i.e., papers, etc) used herein;






R packages used and respective versions

## R packages and versions used in these analyses

sessionInfo()
## R version 3.6.3 (2020-02-29)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.5 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] readxl_1.3.1       tidyr_1.1.2        ggplot2_3.3.2      Matrix_1.3-2      
## [5] gplots_3.1.1       SeuratObject_4.0.0 Seurat_4.0.0       DT_0.16           
## [9] dplyr_0.8.5       
## 
## loaded via a namespace (and not attached):
##   [1] nlme_3.1-152         bitops_1.0-6         matrixStats_0.57.0  
##   [4] RcppAnnoy_0.0.18     RColorBrewer_1.1-2   httr_1.4.2          
##   [7] sctransform_0.3.2    tools_3.6.3          R6_2.5.0            
##  [10] irlba_2.3.3          rpart_4.1-15         KernSmooth_2.23-20  
##  [13] uwot_0.1.10          mgcv_1.8-35          lazyeval_0.2.2      
##  [16] colorspace_1.4-1     withr_2.4.1          tidyselect_1.1.0    
##  [19] gridExtra_2.3        compiler_3.6.3       plotly_4.9.3        
##  [22] labeling_0.4.2       caTools_1.18.0       scales_1.1.1        
##  [25] lmtest_0.9-38        spatstat.data_1.7-0  ggridges_0.5.3      
##  [28] pbapply_1.4-3        goftest_1.2-2        spatstat_1.64-1     
##  [31] stringr_1.4.0        digest_0.6.27        spatstat.utils_2.0-0
##  [34] rmarkdown_2.5        pkgconfig_2.0.3      htmltools_0.5.1.1   
##  [37] parallelly_1.22.0    fastmap_1.1.0        htmlwidgets_1.5.3   
##  [40] rlang_0.4.10         shiny_1.6.0          farver_2.0.3        
##  [43] zoo_1.8-8            jsonlite_1.7.2       crosstalk_1.1.0.1   
##  [46] gtools_3.8.2         ica_1.0-2            magrittr_2.0.1      
##  [49] patchwork_1.1.1      Rcpp_1.0.6           munsell_0.5.0       
##  [52] abind_1.4-5          reticulate_1.18      lifecycle_0.2.0     
##  [55] stringi_1.5.3        yaml_2.2.1           MASS_7.3-54         
##  [58] Rtsne_0.15           plyr_1.8.6           grid_3.6.3          
##  [61] parallel_3.6.3       listenv_0.8.0        promises_1.1.1      
##  [64] ggrepel_0.9.1        crayon_1.3.4         deldir_0.2-9        
##  [67] miniUI_0.1.1.1       lattice_0.20-44      cowplot_1.1.1       
##  [70] splines_3.6.3        tensor_1.5           knitr_1.30          
##  [73] klippy_0.0.0.9500    pillar_1.4.7         igraph_1.2.6        
##  [76] future.apply_1.7.0   reshape2_1.4.4       codetools_0.2-18    
##  [79] leiden_0.3.7         glue_1.4.2           evaluate_0.14       
##  [82] data.table_1.13.2    vctrs_0.3.6          png_0.1-7           
##  [85] httpuv_1.5.5         cellranger_1.1.0     polyclip_1.10-0     
##  [88] gtable_0.3.0         RANN_2.6.1           purrr_0.3.4         
##  [91] scattermore_0.7      future_1.21.0        assertthat_0.2.1    
##  [94] xfun_0.19            mime_0.9             xtable_1.8-4        
##  [97] later_1.1.0.1        survival_3.2-11      viridisLite_0.3.0   
## [100] tibble_3.0.6         cluster_2.1.2        globals_0.14.0      
## [103] fitdistrplus_1.1-3   ellipsis_0.3.1       ROCR_1.0-11








References

Allaire, JJ, Yihui Xie, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, Hadley Wickham, Joe Cheng, Winston Chang, and Richard Iannone. 2020. Rmarkdown: Dynamic Documents for R. https://github.com/rstudio/rmarkdown.

Butler, Andrew, Paul Hoffman, Peter Smibert, Efthymia Papalexi, and Rahul Satija. 2018. “Integrating Single-Cell Transcriptomic Data Across Different Conditions, Technologies, and Species.” Nature Biotechnology 36: 411–20. https://doi.org/10.1038/nbt.4096.

Hao, Yuhan, Stephanie Hao, Erica Andersen-Nissen, William M. Mauck III, Shiwei Zheng, Andrew Butler, Maddie J. Lee, et al. 2020. “Integrated Analysis of Multimodal Single-Cell Data.” bioRxiv. https://doi.org/10.1101/2020.10.12.335331.

R Core Team. 2020. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Satija, Rahul, Jeffrey A Farrell, David Gennert, Alexander F Schier, and Aviv Regev. 2015. “Spatial Reconstruction of Single-Cell Gene Expression Data.” Nature Biotechnology 33: 495–502. https://doi.org/10.1038/nbt.3192.

Stuart, Tim, Andrew Butler, Paul Hoffman, Christoph Hafemeister, Efthymia Papalexi, William M Mauck III, Yuhan Hao, Marlon Stoeckius, Peter Smibert, and Rahul Satija. 2019. “Comprehensive Integration of Single-Cell Data.” Cell 177: 1888–1902. https://doi.org/10.1016/j.cell.2019.05.031.

Xie, Yihui, J. J. Allaire, and Garrett Grolemund. 2018. R Markdown: The Definitive Guide. Boca Raton, Florida: Chapman; Hall/CRC. https://bookdown.org/yihui/rmarkdown.

Xie, Yihui, Christophe Dervieux, and Emily Riederer. 2020. R Markdown Cookbook. Boca Raton, Florida: Chapman; Hall/CRC. https://bookdown.org/yihui/rmarkdown-cookbook.